Wikiwhere: An interactive tool for studying the geographical provenance of Wikipedia references

Authors

  • Martin Körner
  • Tatiana Sennikova
  • Florian Windhäuser
  • Claudia Wagner
  • Fabian Flöck
Abstract

Wikipedia articles about the same topic in different language editions are built around different sources of information. For example, the news articles linked as references in the English Wikipedia article titled “Annexation of Crimea by the Russian Federation” differ markedly from those in its German counterpart (determined via Wikipedia’s language links). Some of this difference can of course be attributed to the different language proficiencies of readers and editors in separate language editions; yet, although including English-language news sources seems to be no issue in the German edition, the English references listed there overlap little with the ones in the article’s English version. Remarkably, the German version, compared to its English counterpart, shows a notably stronger imbalance in favor of Russian sources over Ukrainian ones, and also a lower overall ratio of Ukrainian and Russian sources in relation to sources in the native language of the Wikipedia edition (cf. Figure 1), although many of these pages are written in English and could easily be included in the German article. Such patterns could be an indicator of bias towards certain national contexts when referencing facts and statements in Wikipedia. However, determining which national context each reference can be traced back to, and comparing the resulting link distributions, is infeasible for casual readers or scientists with non-technical backgrounds. Wikiwhere answers the question of where Web references stem from by analyzing and visualizing the geographic location of external reference links that are included in a given Wikipedia article. Instead of relying solely on the IP location of a given URL, our machine learning models consider several features.
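
As a rough illustration of the kind of comparison described above, the sketch below guesses a country for each reference URL from a single feature, its country-code top-level domain, and computes the share of references per country. The feature choice, the small lookup table, and the function names are assumptions made for illustration only; the abstract does not state which features the Wikiwhere models use, and this is not their implementation.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical, deliberately tiny ccTLD table (an assumption for this sketch);
# a real tool would need a complete mapping and further features.
CCTLD_TO_COUNTRY = {"de": "DE", "ru": "RU", "ua": "UA", "uk": "GB", "fr": "FR"}


def tld_country(url):
    """Return a country code guessed from the URL's top-level domain, or None."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    return CCTLD_TO_COUNTRY.get(tld)


def country_distribution(urls):
    """Share of URLs per guessed country; unresolved URLs count as 'unknown'."""
    counts = Counter(tld_country(u) or "unknown" for u in urls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {country: n / total for country, n in counts.items()}
```

Running `country_distribution` on the reference lists of two language versions of the same article would give two distributions that can be compared side by side. Generic domains such as `.com` stay unresolved under this single feature, which is one reason a single signal is not enough on its own.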




Journal:
  • CoRR

Volume: abs/1612.00985

Publication date: 2016